Soft decision strategy and adaptive compensation for robust speech recognition against impulsive noise
Abstract
This paper studies robust automatic speech recognition (ASR) in the presence of impulsive noise, which is usually caused by transmission errors or packet loss when speech signals are delivered over a network. A soft decision strategy is proposed based on an analysis of how impulsive noise degrades the observation probabilities. Based on the soft decision results, two compensation methods are developed. The first suppresses unreliable likelihood scores by flooring the observation probabilities (FOP) on sensitive feature components with an adaptive threshold. The second recovers the corrupted features; any remaining unreliability in the reconstructed data can be further compensated by the flooring method. Evaluation results on the Aurora connected digits database show that the proposed methods significantly improve recognition robustness against impulsive noise. For example, at an impulsive noise occurrence rate of 50% in a simulated environment, accuracy increases from the baseline's 42.74% to 85.35%.
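The flooring idea can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes a single diagonal-covariance Gaussian as the observation model, and it uses a fixed, hypothetical floor (`LOG_FLOOR`) where the abstract describes an adaptive threshold whose form is not given here. The function names are illustrative only.

```python
import math

def component_log_probs(x, mean, var):
    """Per-component log-density of a diagonal-covariance Gaussian."""
    return [
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    ]

def floored_log_likelihood(x, mean, var, log_floor):
    """Sum the per-component log-probabilities, clamping each at log_floor."""
    return sum(max(lp, log_floor) for lp in component_log_probs(x, mean, var))

# Toy model and observations (all values are made up for illustration).
mean = [0.0, 0.0, 0.0]
var = [1.0, 1.0, 1.0]
clean = [0.1, -0.2, 0.05]
corrupted = [0.1, -0.2, 40.0]  # third component hit by an impulse

# Assumption: a fixed floor; the paper adapts its threshold instead.
LOG_FLOOR = -10.0

plain = sum(component_log_probs(corrupted, mean, var))
floored = floored_log_likelihood(corrupted, mean, var, LOG_FLOOR)
print(f"unfloored: {plain:.2f}, floored: {floored:.2f}")
```

Without flooring, the single corrupted component dominates the frame score; with flooring, its contribution is bounded by the floor, while clean frames whose components all score above the floor are left unchanged.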
Similar resources
A Robust Distributed Estimation Algorithm under Alpha-Stable Noise Condition
Robust adaptive estimation of unknown parameters has been an important issue in recent years for reliable operation in distributed networks. Conventional adaptive estimation algorithms that rely on the mean square error (MSE) criterion perform well in the presence of Gaussian noise, but their performance decreases drastically under impulsive noise. In this paper, we propose a rob...
A new method for robust speech recognition based on missing data using a bidirectional neural network
The performance of speech recognition systems is greatly reduced when speech is corrupted by noise. One common approach to robust speech recognition is the missing feature method. In this approach, the components of the signal's time-frequency representation (spectrogram) that have a low signal-to-noise ratio (SNR) are tagged as missing and deleted, then replaced using the remaining components and statistical ...
Improving the performance of MFCC for Persian robust speech recognition
Mel frequency cepstral coefficients (MFCCs) are the most widely used features in speech recognition, but they are very sensitive to noise. In this paper, to achieve satisfactory performance in automatic speech recognition (ASR) applications, we introduce a new noise-robust set of MFCC vectors estimated through the following steps. First, spectral mean normalization is a pre-processing step which applies to t...
Comparison of Voice Activity Detectors for Interview Speech in NIST Speaker Recognition Evaluation
Interview speech has become an important part of the NIST Speaker Recognition Evaluations (SREs). Unlike telephone speech, interview speech has substantially lower signal-to-noise ratio, which necessitates robust voice activity detection (VAD). This paper highlights the characteristics of interview speech files in NIST SREs and discusses the difficulties in performing speech/nonspeech segmentat...
A study of voice activity detection techniques for NIST speaker recognition evaluations
Since 2008, interview-style speech has become an important part of the NIST Speaker Recognition Evaluations (SREs). Unlike telephone speech, interview speech has lower signal-to-noise ratio, which necessitates robust voice activity detectors (VADs). This paper highlights the characteristics of interview speech files in NIST SREs and discusses the difficulties in performing speech/non-speech seg...
Publication date: 2005